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SUMMARY 

We report a new method to infer continuous time series of the declination, inclination and 
intensity of the magnetic field from archeomagnetic data. Adopting a Bayesian perspec¬ 
tive, we need to specify a priori knowledge about the time evolution of the magnetic field. 
It consists in a time correlation function that we choose to be compatible with present 
knowledge about the geomagnetic time spectra. The results are presented as distributions 
of possible values for the declination, inclination or intensity. We find that the methodol¬ 
ogy can be adapted to account for the age uncertainties of archeological artefacts and we 
use Markov Chain Monte Carlo to explore the possible dates of observations. We apply 
the method to intensity datasets from Mari, Syria and to intensity and directional datasets 
from Paris, France. Our reconstructions display more rapid variations than previous stud¬ 
ies and we find that the possible values of geomagnetic field elements are not necessarily 
normally distributed. Another output of the model is better age estimates of archeological 
artefacts. 
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1 INTRODUCTION 


From 1840 onward, continuous records from ground-based observatories are available and make it 
possible to characterize the time derivative of the main field, or secular variation, as a function of 
length-scale (Holme et al.] |2011[ ). The spectral properties of magnetic series obtained from these in¬ 
strumental records can be transposed into prior information on core processes in the framework of 
stochastic processes (Gillet et al. 2013| ). 

Before the first direct measurements, direction and intensity of the magnetic field can be inferred 
from remanent magnetization of sediments, volcanic deposits or archeological artifacts. The sparse 
repartition of archeomagnetic data in space and time and their associated large measurement and dating 
uncertainties limit our ability to recover the spatio-temporal variations of the geomagnetic field over 
the past few millennia. Strong efforts have nevertheless been made to calculate time-dependent global 
models of the archeomagnetic field from these data ( |Korte and Constable|2003[ [Korte et al.|2009[|Licht| 
et al. |2013| ). To take advantage of the large amount of data and the relatively dense temporal coverage 


available in some areas, for instance in Western Europe (Donadini et al. 2009| ), archeomagnetic data are 
also used to construct regional curves (so-called master curves) that describe the temporal behavior 
of the magnetic field ( |Le Goff et al.||2002t |Lanos et akl|2005[ jThebault and Gallet||2010| ). Beyond 
information on processes occurring in the core, master curves provide useful tools for archeomagnetic 
dating. 

In this study, we focus on the construction of regional archeomagnetic models describing the time 
evolution of the declination, inclination and intensity of the Earth’s magnetic field over the past 6000 
years. To compensate for the uneven repartition of data and their large uncertainties and to reduce 
the non-uniqueness, the construction of such models usually incorporates a regularization in time 
that consists in penalizing the second time derivative of the field (Bloxham and Jackson |T992| ). Such 
regularizations, however, arbitrarily smooth the reconstructed time fluctuations. Instead, we rely here 
on a Gaussian process regression method based on prior information extrapolated from the statistical 
properties of models obtained from satellite and observatory data. 

Furthermore, dating uncertainties in archeomagnetic data are an important source of errors in the 
construction of master curves and most inversion methods do not directly account for them. Indeed, 
dating errors are often converted into equivalent measurement errors (Korte et al. 2005| ); alternatively, 
they are estimated using bootstrap or jack-knife methods, which consist in investigating the variability 
of models obtained from an ensemble of randomly noised and/or sub-sampled datasets ( Korte et al. 


2009). Here, we use Markov Chain Monte Carlo (MCMC) methods for the dates at which observations 


have been obtained, based on the probability inherent to the Gaussian process method. 

This paper is divided into 5 sections. We present in the next section the Gaussian process regres- 




































Stochastic modeling of regional archeomagnetic series 3 

sion framework, our choice of prior information for the model parameters, the use of Markov Chain 
Monte Carlo on observation dates and a robust measure of data errors in order to decrease the effect of 
outliers. The method is tested using synthetic observations (section 3), before being applied to datasets 
from France and the Middle East (section 4). A discussion of our results and conclusions are presented 
in section 5. 
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2 METHOD 


We consider geomagnetic series as the realization of a stochastic process sampled through observa¬ 
tions. We use the Gaussian Process Regression (section 2.1) to couple the information contained in 
measurements with that from the a priori time covariance function of the process. To account for dating 
errors, we integrate the regression method into a Markov Chain Monte Carlo algorithm, as described 
in section 2.2. We define, in section 2.3, the a priori information on which relies the Gaussian process 
framework. Finally, in section 2.4, we show how to incorporate a robust measure of data errors in or¬ 
der to decrease the effect of outliers that appears when using a standard L2-measure with geophysical 
series. 


2.1 Gaussian process regression 

Let us consider a Gaussian, stationary stochastic process < p(t ) = Tp + (p f (t), defined by its average 
value Tp, the perturbation <p f (t) from this mean value and its covariance function: 

Cov(^(i), <p(t + r)) = E(<p'(t)<p'(t + t)) = <j 2 p(t ), (1) 

with cr 2 the variance and p the autocorrelation function of the process that will contain the a priori 
information on the model parameters ; the notation £'(...) stands for the statistical expectation. The 
continuous process p> is sampled with data stored at discrete times into a vector y, and estimated as 
a sequence of parameters stored into a vector m = m + m', with m the background model and m' 
the model perturbation. In our context we consider that the parameters in m are homogeneous to the 
observations in y (they are images of the same quantity). Vectors t y and t m contain respectively the 
epochs at which the data and the model are sampled. 

The estimate of the model m, given the data y and the measurement errors e, is characterized by 
the a posteriori expectation model 

m 4- C m y(Cyy H- C ee ) (y y), (2) 


and the a posteriori covariance matrix C*: 

r* — C -C (C +C ) -1 C T 

— ^mm ^myV^yy \ ^ee ) 


my 


( 3 ) 


(Rasmussen and Williams|2006). Here y is the prediction from the background model m at times t y , 
C ee = E(ee T ) is the data error covariance matrix. Matrices C mm , C my and C yy are derived from the 
autocorrelation function using expression 0: 


— (7 p{tmi tmj)\ ^-'rnyij — CT p(tmi ® PiPvi ^2/i) 


( 4 ) 
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Note that the above estimate ([2]) in term of Gaussian process comes down to calculating the BLUE 
(Best Linear Unbiased Estimator). 


2.2 Accounting for dating uncertainties with Markov Chain Monte Carlo 

Expression (|4j) assumes that each datum y L is representative of an epoch t yv Because dating uncertain¬ 
ties are prominent in archeomagnetic databases, we should consider the probability density function 
(pdf) for the date of the datum y h . This distribution depends on the dating method: it is generally con¬ 


sidered Gaussian for 14 C dating (e. g. Aguilar Reyes et al. 2013), uniform when the date is estimated 
from historical or archeological constraints (e.g. Genevey et al .][2003 ), or more complex in the case of 
calibrated 14 C dates (Reimer et al. 2009). 

To consider these dating uncertainties, we build several sets of dates t y , illustrated in Figure [1^. 
We associate at each record a date drawn inside its dating error bar. We estimate for each draw a model 
m y defined by a mean model and its covariances C* (equations ([2]) and ([3])) at times t y . We then 
evaluate the joint probability of the draw after |Lanos| ([2004 ), see also (Pavo mCarrasco et al.|2011| ): 


p(ty, y|m y ) oc p(m y \ty, y) x p(t y , y) (5) 

The integration of the probability density function over all possible values of y gives the posterior 
probabilities of the dates t y . 

/ +oo 

p{ty,y\m y )dy ( 6 ) 

-OO 

In practice, we first evaluate these probabilities for each record at time t yi . To this end we multiply 
the Gaussian posterior probability density function Af(m yi ,a yi ) of the model at time t yi (red curves 
in Figure [1J> and c), by the Gaussian prior probability density function Af(yi, ef) of measurement y L 
(blue curves in Figure [1J> and c). The notation Af(p, a) stands for Gaussian distribution with mean 
p and standard deviation a. The standard deviation a yi of the model at time t yi is obtained from the 
posterior covariance matrix C*. By this multiplication, we obtain the joint probability density function 
(green curve in Figure[l]:). We then integrate the obtained probability density function over all possible 
values of yi to get the probability of the date t yi . We finally multiply the posterior probabilities of all 
dates to obtain the probability of draw k, noted Pdraw fc - 

A natural way to proceed following [Lanos| ( |2004| ) is to weigh each mean model given the probabil¬ 
ity of the corresponding draw. However, few draws have very high probabilities compared to all others 
and numerous iterations provide very few representative mean models. To overcome this problem, we 
use Markov Chain Monte Carlo to explore the possible dates of observations and to select draws with 
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Figure 1. Illustration of equations ([5]) and ([6]): a) Syrian intensity dataset (black) together with one draw of 
random dates (red stars) ; b) Gaussian prior pdf of the measurements (blue curves) and Gaussian posterior pdf 
of the model (red curves) for five data from the inset in a) ; c) Gaussian prior pdf of the observation at 2900 BC 
(blue curve), Gaussian posterior pdf of the model (red curve), and combined pdf of the two (green curve). 
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the highest probabilities. We remind here the main steps, (see Aste r et al.| ( |2013| ); |Gilks et ah ( 1996a ) 
for more details): 

1- To explore the possible dates, we generate the k t h draw from the previous one as a random 
draw inside the proposal distribution J\f(tk-u ctmcmc) x p(t). For uniform probability distribution, it 
comes down to a random walk restricted to the a priori time interval. 

-Pdraw^, 


2- We define an acceptance ratio a = min(l, 5 ), with s = 


draw fc-i 


3- We keep the k t h draw if a > u, u being a random value obtained from a uniform distribution 
between 0 and 1, and reject it if not. 

We stop the chain after N iterations depending on the dataset studied and perform several chains to 
better explore the space of possible dates ([Gilks et al. 1 996bj , p 13). The number of chains is determined 
by the evolution of the posterior distribution of the dates. The number of accepted draws in each chain 
depends on ctmcmc- We adjust the latter parameter so the number of kept draws is between 20 and 
60% of all draws. All these informations are summarized in Appendix B. Each draw k selected by the 
above Markov rules consists in a set of dates t^, associated to records y and measurement errors e. 
From equations ([2]) and ([3]), we obtain for each dataset a mean model m and its associated covariances 
C* at the times t m . 

Expectations and a posteriori covariance matrices are used to build ensembles of models consistent 
with both the observations and the a priori information assumed for model parameters. To this end, we 
use the Choleski decomposition U of the a posteriori covariance matrix, C* = U T U, from which we 
compute an ensemble of model realizations m = m + U T m, with m a random Gaussian vector with 
zero mean and unit variance. Each draw selected by the Markov chain is used to build an ensemble of 
realizations. We put together all these ensembles to build our final estimate of the probability density 
function. Note that this distribution is not necessarily Gaussian. 


2.3 A priori covariances on geomagnetic series 

We detail below how we derive our covariances on geomagnetic series (intensity F, inclination /, dec¬ 
lination D) from a priori covariances on the Gauss spherical harmonic coefficients. These are chosen 
to be compatible with the temporal power spectral densities recorded in ground-based observatories 
dGillet et al.|20l3] >. 

We assume that all Gauss coefficients (g™, h™), with n and m the spherical harmonic degrees and 
orders, result from an auto-regressive (AR) process of order 2, with correlation function 

y/3 r \ 


' , \ 

1 H- - exp 

T c{n) 


r c {n) 


Pn(r) = 


(7) 
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Covariances for Gauss coefficients are then: 


Co+ r)) = a 2 {n)p n {r) = K n (r ), 


( 8 ) 


with a similar notation for h™ coefficients. The time r c and the variance a 2 are functions of the degree 
n only. We assume that there is no cross-correlations between Gauss coefficients of different degrees 
and orders, and between g and h as well. Note that this correlation function is solution of the stochastic 
differential equation ( Yaglo mj2004[ ): 


dip' 2y/3 7 , 3 , 7 , . 

d ——|- dip H —^(p dt = d£(£) , 

CiL T n 


(9) 


where ((t) is the Brownian motion (or Wiener process). 

Variances cr 2 (n ) for the non-dipole Gauss coefficients are obtained from the variance of the Gauss 
coefficients estimated in satellite field models, as in the models COV-OBS flGillet et al.|2013| ): 

<%( n ) = 2n X + l [dnitf + hn(t) 2 } . ( 10 ) 

m =0 

Using a similar definition for cr|(n), equation <[9j) imposes the value of the correlation time: 


a g\ n ) 


(ID 


The background model is composed of the axial dipole value g\ = —35/iT, and the variance for the 
dipole coefficients is chosen as a g ( 1) = 5pT 2 , the value typically found for the past 4000 years (Korte 


and Constable 


2011). Since crj(n) is not affected by the presence of a stationary background, we find 


a correlation time of about 200 years for all coefficients of degree one. 

We have propagated this a priori information on Gauss coefficients to geomagnetic series of dec¬ 
lination D , inclination / and intensity F recorded at the Earth’s surface. Our approach requires that 
these quantities have a Gaussian distribution. It has been shown that the intensity distribution was close 
to a Gaussian distribution in the limit of small relative dispersion (Love and Constable |2003| ). This is 
indeed the case for archeomagnetic data, since on centennial to millennial time-scales the standard 
deviation in the axial dipole is small compared to the average value. Assuming that Gauss coefficients 
are the result of a random stationary process and that they have a zero mean except for the axial dipole 
g i, we show in Appendix A how to obtain the mean, covariance and cross-covariance of geomagnetic 
series of D , / and F (equations ( |A.15| ) and ( A.16 )). Covariances depend on the colatitude 6 of the 
sampled site, on g® and on sums over degree n of the correlation function defined in equation ([ 7 ]). 
Note in particular that we find non-zero covariances between F and I. 

Studies carried out on magnetic series from paleomagnetic to archeomagnetic records suggest a 
continuous spectrum of the Virtual Axial Dipole Moment ( Constable and Johnson |2005 ; [Ziegler et al. 
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2011[ ), with slope decreasing from about zero on the longest periods towards about -2 at millennial 


periods. The analysis of models of Holocene lake sediment magnetic records (Panovska et a k|2013| ) 
has shown that temporal power spectra for declination, inclination and relative paleointensity from 
lake sediments data follow a power law with a slope —2.3 ± 0.6 for periods between 300 and 4000 
years. These findings are in good agreement with recent results obtained for the dipole moment from 


geodynamo numerical simulations (Olson et al.| |2012[ ), which also display steeper slopes at higher 
periods. 

The a priori information discussed above presents the advantage to require only a single parameter 
per degree (r c ). The slope of the temporal power spectrum for a process defined by equation ([9]) is by 
construction -4 at periods r <C r c ( |Gillet et al.|2013| ), which agrees with that obtained for observatory 
series (De Santis et a l.|2003| ). We illustrate in Figure[2]that we retrieve the -4 slope for spectra of the 
auto-correlation functions for F, D and /, obtained with equation (A. 15) - the square of the power 
spectrum for a series c p(t ) is the power spectrum of its covariance function Cov(<p(t); <p(t + r)). The 
choice of a priori information in the present study is particularly important for periods shorter than a 
few hundred of years. Indeed, archeomagnetic data being sparse in time, it is towards high frequencies 
that we need to buttress the evidence from observations with prior information. 


2.4 Dealing with outliers using robust measures of the data errors 

The methodology developed hitherto relies on a L2-norm to account for measurement errors, which 
makes the approach vulnerable to large errors. Outliers to the L2-norm are unfortunately a common 
feature of archeomagnetic data analyses (Suttie et al7| |2011 ). To decrease the effect of these outliers 
when using L2-norms, Donadini et al. ( [2009] ) assigned to all data a minimum value for the measure¬ 
ment errors (5/iT for intensity data and 4.3° for directional data). We can instead modify the measure 
of the misfit to observations and replace the L2-norm with the Huber norm, which distribution is 
defined as: (see Farqu harson and O ldenburg (1998)): 


P(r) = 2 


( 12 ) 


ex p( — t) 

exp(— c\r\ + y) 

with N = 2.6046 for c = 1.5 in this study and r, the normalized data misfit residuals. To implement 


\r\ 

\r\ 


< 

> 


the Huber norm with the previous method, we use the iteratively re-weighting least-squares algorithm 


where the matrix C ee is constructed from the residual of the data i, r % — 


I Vi 


m 


■yi\ 


G. 


, as 


yi 


a; 


C — 

^een — 


yi 


na, 


yi 


r% < 


Ti > 


(13) 
























10 G. Hellio, N. Gillet, C. Bouligand , D. Jault 



Figure 2. Normalized power spectral density of intensity (blue), inclination (black) and declination (red) cal¬ 
culated at co-latitude 6 = 45° with a spherical harmonic truncation TV = 14. The -4 power law is plotted for 
comparison in green. Note that curves are superimposed. 

The Huber norm impacts also the joint probability (equation ([5])). Instead of multiplying the Gaussian 
posterior probability density function of the model by the Gaussian prior probability density function 
of the measurements, we multiply it by the Huber probability density function defined in equation 
( fl2| ). Few iterations are needed to obtain convergency. The use of the Huber norm rescales the weight 
in C ee associated with outliers. We present in the following section synthetic tests for which there is 
no need to use this norm since there are no outliers. In section 4 however, we apply the Huber norm to 
all geophysical datasets. We compare it with the L2-norm for the Syrian series to show how it reduces 
the effect of outliers. 
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3 SYNTHETIC TESTS 


In order to test the Gaussian process regression on observations presenting dating errors (accounted 
for with the MCMC method), we build synthetic datasets of D, I and F that are consistent with an 
AR process of order 2 as defined in section |23| and that display similar characteristics to real archeo¬ 
magnetic datasets in terms of temporal distribution and errors. To this end, we first construct series 
for the period 3000 BC to 2000 AD, sampled every 10 years, using the covariance functions defined 


in equations (A.15) and (A.16). In these covariance functions, the functions K n {r) are defined using 
the variances and correlation times defined in equations ( fT0| ) and ( [IT] ), and the sums are performed 
with a spherical harmonic truncation degree N = 14. We observe that the model is not modified 
when increasing further this truncation degree, and that it is already converged with N — 4. We then 
randomly sub-sample the series and add random measurement and dating errors to each data. These 
errors are built using a Gaussian law for measurement errors and a uniform law for dating errors, 
to mimic the dating uncertainties from historical constraints. We present for comparison the results 
considering Gaussian dating errors. Finally, the measurement and dating errors used in the modeling 
phase correspond to the standard deviation and the half-width of the law used to build them. We report 
in Appendix B, the parameters used for MCMC method for all studied series. We use two different 
datasets consisting of 20 and 50 records respectively with randomly assigned dating and measurement 
errors. Dating errors are generated from a uniform distribution with a half-width of 25 years, and 
measurement errors from a Gaussian distribution with a standard deviation of 1/iT. 

We report in figures [3ja) and[3jb) the obtained pdf of the intensity. We first notice that the distri¬ 
bution always encompasses the true series (black curve). In the case where 20 data only are available, 
the sharp changes present in the true series are not closely recovered by the pdf due to the lack of 
data, and the range of estimates is wide except during the few time intervals that are well sampled. 
Increasing the quantity of synthetic observations dramatically improves the fit of the pdf to the true 
series and narrows the distribution (see figure [3(b)] ). 

In figure |3(c)l we invert the same dataset as in figure |3(a)} here again noised following uniform and 
Gaussian laws with respectively 25 years half-width and 1 pT standard deviation. However, following 
the strategy used by [Donadini et al.| ( |2009| ), we assign in the inversion a minimum threshold value for 
measurement errors, that replaces error estimates lower than this minimal value, chosen to be 5 pT 
for intensities. The distribution is significantly affected by this process, the dispersion happens to be 
strongly increased particularly when data are available. We conclude here that this way of handling 
small measurement errors penalizes accurate data and leads to lose information. In figure |3(d)[ we 
invert the same dataset as in figure |3(a)| but after multiplying dating errors by a factor of ten. The 
dispersion is then a lot wider for the whole studied period. 
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For all the precedent cases, the dating errors are supposed uniform what is mostly the case for 
archeomagnetic objects. However, some of them are dated by radiocarbon methods, which can lead to 
Gaussian or more complicated error distributions. Figure [3(e)] presents the obtained pdf when dating 


errors are assumed Gaussian for the inversion of the same dataset as in figure |3(d)| We see that the 
resulting pdf are rather similar, although Gaussian dating errors slightly increase the pdf when ob¬ 
servations are available. These tests illustrate the importance of assigning realistic error bars for both 
dating and measurement errors. Furthermore, it shows that our method, where the posterior covari¬ 
ance matrix is used to estimate the model error, is capable of accounting for a realistic measure of 
the information contained into geomagnetic observations, and thus avoids reducing the importance of 
relatively more accurate records. 

We have evaluated the importance of considering covariances between intensity and inclination 
within synthetic tests but have not seen significant differences while inverting jointly or separately 
these observations. Further on, covariances between F and / are considered. 
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Figure 3. Probability density function of the intensity obtained from a synthetic dataset sampled from a true 
series (black), a) 20 data, correct errors used in the inversion ; b) 50 data, correct errors used in the inversion ; c) 
same as (a) but all measurement errors smaller than 5/iT have been converted to 5/iT ; d) same as (a) but with 
dating errors multiplied by a factor of ten ; e) Same as d) but assuming Gaussian dating errors. The half width of 
the uniform law a u , has been transformed into the standard deviation of the Gaussian law a g = a u / y/3 so that 
the uniform and Gaussian law have the same standard deviation. The standard deviation of the Gaussian law 
and the half width of the uniform law used to generate random measurement and dating errors are set randomly 
over the dataset with mean values of 1 /xT for measurement errors and 25 yrs (a), (b), and (c) or 250 yrs (d) and 
(e) for dating uncertainties. 
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4 APPLICATION TO DATA SETS FROM SYRIA AND FRANCE 


In this section, we present results for Mari (Syria) and Paris (France), obtained from intensity data in 
Syria for epochs between 4000 BC and 0 and from directional and intensity data in France for epochs 
between 0 and 1900 AD. For directions, we have converted the 95% cone of confidence ( 0 ^ 95 ) onto 
declination crp and inclination 07 errors ( |Piper|1989 ): 


81 


81 


a D = 


—---« 95 ;ctj = 777:^95 

140 cos / 140 


(14) 


4.1 Archeointensity data from the Middle-East 


The dataset used here comprises 39 intensity values for Syria (Gen evey et al.|2003||Gallet et al.|2 006 


2008; Gallet and A1 Maqdissi 2010). All data are reduced to Mari in Syria using the geomagnetic 
axial dipole (GAD) hypothesis. The error caused by the reduction is small compared to measurement 
errors. A priori information is built from a magnetic field model truncated at spherical harmonic degree 

N = 4. 

Results are displayed in figure[4ja) for the Syrian dataset. The distribution is narrow between 2700 
and 1600 BC due to the numerous data present during this period. Local maxima appear resolved in 
2500, 2250, 1450 and 650 BC. The distribution prevents us from concluding about extrema value 
around 3200 BC. Next, we have augmented the dataset with new archeointensity data from Syria 
( [Gallet and Le Goff||2006[ |Gallet and Butterlin1|2014[ |Gallet et ~aT7||2014] ), and data from the southern 
Levantine region and Iran ( Ben-Yosef et al.|2008 2009[ |Ertepinar et al.|2012 , Shaar et al.|2011| ). The 
new data are plotted in blue. Note that the dataset used here comprises more data than the expanded 
dataset used by Thebault and Gallet ( 2010| ). Study of the distribution obtained from this expanded 
dataset confirms the maxima inferred in 2500 and 2250 BC, figure |4jb). Two sharp maxima appear 
in 1000 and 650 BC. On figure [4^), we remark a wide dispersion around 3200 BC. The few data 
added between 3500 and 3000 BC, despite very large uncertainties, point to a maximum in 3400 BC 
followed by a local minimum in 3200 BC although the distribution is still wide. Increasing the number 
of observations refines the distribution. We see that a mean model in Figure [4j a), would not predict 
the behavior observed with the expanded dataset, the reason why we use probability density functions 
to represent the results. 

Our modeling strategy differs from the iterative inverse method previously developed by [Thebault 
and Gallet|( |20T0| . The latter consists in a projection onto cubic B-splines, penalizing the second time 
derivative, together with a bootstrap strategy to handle dating and measurement errors. Our results 
for the restricted dataset (figure |4ja)) present more rapid variations. Particularly for the two maxima 
of 2250 and 2500 BC which are well defined in our study and confirmed by the recent observations, 
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Figure 4. Probability density function from intensity records a) from Syria alone and from the entire Levantine 
region b) using Huber-norm, c) using L2-norm. Remanence intensities have been transferred to the site of Mari 
(34°N, 40°E) via the geomagnetic axial dipole hypothesis. 
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(a) 3700 BC (b) 1500 BC (c) 50 BC 

Figure 5. Probability density functions of the intensity of three cross-sections in 3700, 1500 and 50 BC for the 
extended dataset. 


whereas the master curve in Thebault and Gallet ( |2010| ) is flat for this period. In comparison, our 
distribution presents also a wider dispersion, particularly when data are sparse. 

We show in Figure |4jc) the results for the expanded dataset when using the L2-norm. Sharp 
local maxima appear now around 800, 1700, 2800 BC which are not apparent in figure |4jb). This 
behavior illustrates a common issue in archeomagnetic modeling. Even when the method accounts for 
all uncertainties present in the dataset, some incompatibilities within the dataset cannot be handled. 
One record appearing in 1785 BC has small dating and measurement errors so it forces alone a sharp 
variation of the model. A rejection criterium has been used in Thebault and Gallet ( |2010| ) to tackle this 
issue. We see that the Huber norm alleviates also this difficulty still allowing these data to possibly keep 
some influence through the MCMC sampling. Here, we show the importance of assigning realistic 
measurement and dating errors to all data. 

Note also that the posterior distribution is not necessarily Gaussian. Figure [5] shows three pdf 
of the intensity estimated in 3700, 1500 and 50 BC. We see that the distributions can be similar to 
Laplacian distribution ([5^), Gaussian distribution ([5]:) or multi-modal distributions ([ 5 J 3 ). This finding 
makes awkward the definition of a mean model, the reason why we only consider pdf and not master 
curves. 

Finally, an important result of our method is the posterior probability on dates. These distributions 
are very different from their a priori uniform distribution. We focus on five data of the extended dataset 
(see colored error bars in Figure [6j a)) and show histograms of the dates preferentially selected in the 
Markov chains (Figure |6jb-f)). The distribution of dates in figure [6^d) is very different from a uniform 
distribution: very few dates appear before 1680 BC and the highest probability for this date is for 
epochs younger than 1650 BC. Figure |6jf) displays a multi-modal distribution that makes unlikely 
epochs around 1450 BC. This methodology can be used to refine the pdf of record dates. 
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Figure 6. b), c), d), e), f): Distribution of the dates after MCMC colored respectively in a). 

4.2 Direction and intensity of the magnetic field in Paris 

We use in this section directional data collated by Bucur1 fT994| , and some of the intensity data pre¬ 


sented in |Genevey et al.| ( [2013| ) for France. We adopt the quality criteria of |Genevey et al.| (|2009) and 
keep only data with age uncertainties lower than 100 years, acquired using the Thellier and Thellier 
method with pTRM-check and with a minimum of three results per site. The dataset finally contains 
119 directional values and 104 intensity measurements. All of them have been reduced to Paris using 
virtual geomagnetic poles derived from the GAD hypothesis. Again, the error caused by the reduction 
is small compared to measurement errors. MCMC parameters are summarized in Appendix B. We 
need less chains than for the Syrian study due to the smaller dating errors. We display in figure [ 7 ] the 
pdf for D, I and F. The intensity series present a general decrease from 850 to 1800 AD, with a local 
maximum in 1350 AD. Data coverage is particularly sparse between 500 and 700 AD, which implies 
a wide dispersion during this period. A maximum close to 80//T appears clearly defined in 850 AD. 
Our results present similar features in comparison with those of Genevey et al.)( |2013| ), except for the 
local maximum around 1600 AD that does not exist in our study. 

Predictions from the ARCH3k global model (Korte and Constable 2011| ) and from the A-FM 
global model (Licht et al.i [2013[ ) are superimposed in figure [ 7 ] for comparison, in blue and green re¬ 
spectively. The models are in good agreement for declination series except for periods between 600 
and 850 AD. For inclination however, the high values found at the end of the IX th century are not 
accounted for by the ARCH3k and the A-FM models. The intensity minimum found in our study 
around 1700 AD is not accounted for by the global models. The intensity maximum appears in both 
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models but is slightly sharper in our model and delayed towards recent epochs. This can be due to 
the penalization of second-time derivatives in the ARCH3k and A-FM models, which may filter out 
locally well documented rapid variations in order to avoid spurious oscillations elsewhere or to the 
fact that this model incorporates globally distributed data. 

The a priori information on the model clearly emerges at epoch for which no data are available. In 
this study, it particularly appears at the end of the studied time interval for declination and inclination. 
There, the model pdf is controlled by the a priori correlation function which ensures the continuity of 
the first time derivative through the AR-2 process. 
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Figure 7. Probability density functions of declination, inclination and intensity records from France. All data 
have been reduced to Paris (48.9°N, 2.3°E). The blue curve represents the prediction from ARCH3k and the 
green curve the prediction from A-FM with their respective 68% confidence interval (dashed lines). 
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5 CONCLUSION 


In this study, we have developed a new method for the construction of archeomagnetic pdf from 
inclination, declination, and intensity data. Our method is based on Gaussian process regression and 
it incorporates a priori information consistent with the statistics obtained from historical geomagnetic 
data. Markov Chain Monte Carlo applied on the dates of observations selects random distribution 
of dates with the highest probabilities. The Huber norm is applied to deal efficiently with outliers. 
This new method has several advantages: first it avoids the use of arbitrary regularization, and any 
unspecified filtering introduced by the projection onto support functions such as cubic B-splines; it 
furthermore allows to account for dating errors in a probabilistic framework. 

We first try our method on synthetic datasets constructed from AR-2 process series. Our tests 
illustrate the importance of using correct estimates of the dating and measurement errors in the inver¬ 
sion in order to optimally recover the a posteriori errors on model parameters. They also show that 
our method is capable of accounting for data displaying disparate accuracies, without losing infor¬ 
mation contained into the highest quality records. The application of this newly developed method to 
European datasets provides pdf that display rapid fluctuations. These are less smooth than changes 
obtained from regularized global (e.g., |Korte et al.|2009| ) or regional (e.g., Thebault and Gallet|2010| ) 
models. The pdf together with the posterior probability of the record dates may be useful for a purpose 
of archeomagnetic dating. 

We find particularly interesting the use of the MCMC method in order to efficiently explore the 
space of possible record dates, as we observe that naive random sampling yields largely disparate 
probabilities for the different sets of dates. We now plan to extend our method to global models. In 
this context, efficient sampling is crucial. 

In the present study we employ the simplest AR-2 stochastic process that mimics well high fre¬ 
quency variations of the field. Over longer periods, a -2 slope temporal power spectral density has 
been put forward ( |Panovska et al.||2013| ). Such a slope is consistent with the identification of archeo¬ 
magnetic jerks ( [Gallet et~aL||2003| ). It has motivated the introduction of AR-I stochastic process in 
the modeling of long period changes of the magnetic field ( Brend eTet al.||2007 , Buffett et al.||2013| ). 
Alternative AR-2 processes may be employed to represent the two behaviors on short (5-100 years) 
and long (300-10,000 years) periods. Consider for instance the damped oscillator process ( |Yaglom| 
2004, eq 2.155’), governed by stochastic equations depending on two parameters and of the general 


form: 


d~(U + ^ a ^ = rf C(*) 


(15) 


The Matern AR-2 process used in this study corresponds to the case a = cj. Using instead 2 a > cj 2 
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one can mimic both the -2 slope temporal power spectral density found for the dipole moment at 
periods up to approximately 10 5 yrs from the analysis of geomagnetic records (Constable and Johnson 


2005| ), and retrieved in geodynamo simulations (Olson et al. 2012| ), and the -4 slope observed at shorter 


periods. This could be an interesting alternative given the cyclic behavior found for the dipole tilt at 
millennial periods (Nilsson et al. |2011 ). 
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APPENDIX A: D, I AND F COVARIANCES 


In this Appendix, we derive the statistical properties (i.e. mean values and covariances) of the inclina¬ 
tion /, declination D , and intensity F of the magnetic field at a location of longitude 0 and colatitude 
0 at the surface of the Earth. We assume that the Gauss coefficients describing the magnetic field are 
the result of a random stationary process, are characterized by a null mean value (except for the ax¬ 
ial dipole, whose mean value is noted g\\ are independent from each other, and have a covariance 
function that depends only on degree n: 


Co v(<CW,C(* + r)) = Cov(C(*), K(t + r)) = K n (r) 


(A.l) 


Such assumptions amount to impose that the statistical properties of the deviation of the magnetic 
field from an axial dipole are invariant over the surface of the Earth (as demonstrated in ( Hulot and 
Bouligand 2005| ». 

We first derive the statistical properties of the north X , east Y, and downward Z components of 
the magnetic field. Their expressions (for a truncation degree N) at the surface of the Earth are (e.g. 
Langel[1987P : 
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Because only g® has a non-zero mean value, the mean values of X , Y, and Z are simply 


X — —g\ sin 9 ; Y — 0 ; Z — —2 g\ cos 6 . 


(A.3) 


Because of the independence of Gauss coefficients, and because their covariance function depends 
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only on the spherical harmonic degree n, covariances on X, Y and Z simplify into : 


N 


Cov(X(t),X(i + r)) =Y J K n {r)Y J 


dP™ (cos 0) 


77=1 777=0 

N 77 


d9 


i 

Co v(Y(t), Y(t + r)) K n (r) £ m 2 (P™(cos0)) 2 


sim 9 

N 


77=1 


777=0 

77 


Cov(Z(*),Z(t + r)) =^(n + l) 2 ^ n (r)y](P n m (cos0)) 2 . (A.4) 

77=1 777 =0 

Cov(X (t),Y(t + r)) =0 
Cov(y(7),Z(7 + r)) =0 

Cov(X(t), Z(< + t)) = - + 1)K„(t) ■£ d -^P^-Pn(c <*«) 

77=1 777 =0 

Such expressions can be further simplified using the following relations for Schmidt normalized asso¬ 
ciated Legendre functions ([Winch and Roberts[1995]): 
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We therefore deduce that: 
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Co v(Z(t),Z(7 + r)) = + l) 2 X n (r) 

77=1 

and that the series of X, F, and Z recorded at a same location are independent from each other : 

Cov(X(£),F(£ + r)) = Cov(F(i), Z(t + r)) = Cov(X(t), Z(t + r)) = 0 (A.7) 

The declination 79, inclination 7 and intensity F of the magnetic field are not linearly related to 
the components X , Y, and Z: 


Y Z 

D = arctan — ; I — arctan , — 

x vx 2 + y 2 


; F = VX 2 + X 2 + Z 2 . 


(A.8) 


Let us denote A = (X, F, Z) and B = (79, /, F). If the vector A does not depart much from its mean 
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value A (corresponding to the mean axial dipole), the above non-linear relations, noted B = ^(A), 
can be approximated using a first-order Taylor expansion : 


Bk 


M A ) + 


d^h 
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_ (Ai — A). 
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(A.9) 


The mean value of B is therefore approximated by: 


Bk = ^fc(A) 


(A. 10) 


Combining equations ( |A. 10| >, ( |A.3| >, and ( |A.S[ ). we obtain the expression for the mean value of D, I, 
and F: 

D = 0 (vrif g\ > 0) ; I = -sgn(p°) arctan 5 F = Iff?I \A + 3cos 2 9 , (A. 11) 

and the covariance matrix for B is approximated by: 
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(A. 12) 


Because the series of X , Y, and Z are independent of each other, this expression can be simplified 
into: 


Co ^B k (t),Bi{t + T)) = Y{jX. 


Co v(Ai(t),Ai(t + r)) . 


(A.13) 


This expression involves the partial derivative of D , /, and F with respect to X, Y, and Z evaluated 
at (X,Y,Z) : 
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(A. 14) 

Finally, combining equations ( |A.13| ), ( |A.6| ), and ( |A.14| ), we obtain the following approximated expres¬ 
sions for the covariances of D, /, and F : 
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and the cross-covariances within the different quantities: 


Co v(D(t),I(t + t)) = 
Cov(/(t), F(t + r)) = 

Cov(7(f), F(t + r)) = 
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cos 9 sin 9 / „ w T , . . 
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(A. 16) 
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APPENDIX B: PARAMETERS USED IN THE MCMC METHOD. NUMBER N OF 
ITERATIONS PER CHAIN, ctmcmc AND NUMBER iV MC MC OF DRAWS SELECTED BY 
THE MARKOV RULES. THE NUMBER OF LINES CORRESPONDS TO THE NUMBER 
OF CHAINS USED FOR EACH FIGURE. 
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Fig. 4(a) 

10000 

30 

2980 

Figs. 7(b and c) 
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Fig. 4(b) 
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